Abstract: Few-shot point cloud semantic segmentation struggles with complex structures, ambiguous semantics, and noise interference due to its limitations in global context modeling, feature alignment, and semantic guidance. To address these issues, a multi-level global-aware model for few-shot 3D point cloud semantic segmentation (MGAM) is proposed. Building on multi-level window partitioning, the receptive field is progressively expanded to jointly model local geometry and global semantics. A dual-domain attention fusion module (DAFM) integrates channel attention and point-wise attention to fuse local and global information. A global auxiliary point mechanism (GAPM) embeds learnable global points into key layers to enhance cross-layer feature propagation. A global category-aware loss (GCAL) adds category-distribution constraints to point-level supervision to highlight small-class targets. Ablation experiments verify the effectiveness and complementarity of each module, and comparative experiments show the superior performance of MGAM, particularly in complex scenarios.
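The global category-aware loss described above combines point-level supervision with a scene-level category-distribution constraint. The paper does not give its exact formulation here, so the following is a minimal hypothetical sketch: per-point cross-entropy plus a KL-divergence term that aligns the averaged predicted class distribution with the label distribution, which keeps small classes from being drowned out. The function name, the weighting factor `lam`, and the specific KL form are illustrative assumptions, not the authors' definition.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_category_aware_loss(logits, labels, num_classes, lam=0.5):
    """Sketch of a category-aware loss: point-wise cross-entropy plus a
    penalty aligning the scene-level predicted class distribution with
    the label distribution, so small classes still contribute.

    logits: (N, C) per-point class scores; labels: (N,) int class ids.
    """
    probs = softmax(logits, axis=1)                               # (N, C)
    # Point-level supervision: mean cross-entropy over all points.
    ce = -np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean()
    # Global category distributions of prediction and ground truth.
    pred_dist = probs.mean(axis=0)                                # (C,)
    true_dist = np.bincount(labels, minlength=num_classes) / len(labels)
    # KL(true || pred): every class present in the labels, however
    # small, contributes a nonzero term to the penalty.
    mask = true_dist > 0
    kl = np.sum(true_dist[mask]
                * np.log(true_dist[mask] / (pred_dist[mask] + 1e-12)))
    return ce + lam * kl
```

Under this sketch, a prediction that is point-wise accurate but collapses rare categories is still penalized through the distribution term, which is the stated motivation for highlighting small-class targets.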
YAN Weishu, YUAN Jian, YANG Mingrui, XU Jiahui. Multi-level Global-Aware Model for Few-Shot 3D Point Cloud Semantic Segmentation[J]. Pattern Recognition and Artificial Intelligence, 2026, 39(3): 225-238.